Histogram of Oriented Principal Components for Cross-View Action Recognition
Existing techniques for 3D action recognition are sensitive to viewpoint
variations because they extract features from depth images which are viewpoint
dependent. In contrast, we directly process pointclouds for cross-view action
recognition from unknown and unseen views. We propose the Histogram of Oriented
Principal Components (HOPC) descriptor that is robust to noise, viewpoint,
scale and action speed variations. At a 3D point, HOPC is computed by
projecting the three scaled eigenvectors of the pointcloud within its local
spatio-temporal support volume onto the vertices of a regular dodecahedron.
HOPC is also used for the detection of Spatio-Temporal Keypoints (STK) in 3D
pointcloud sequences so that view-invariant STK descriptors (or Local HOPC
descriptors) at these key locations only are used for action recognition. We
also propose a global descriptor computed from the normalized spatio-temporal
distribution of STKs in 4-D, which we refer to as STK-D. We have evaluated the
performance of our proposed descriptors against nine existing techniques on two
cross-view and three single-view human action recognition datasets. The
experimental results show that our techniques provide significant improvement
over state-of-the-art methods.
Action Classification with Locality-constrained Linear Coding
We propose an action classification algorithm which uses Locality-constrained
Linear Coding (LLC) to capture discriminative information of human body
variations in each spatiotemporal subsequence of a video sequence. Our proposed
method divides the input video into equally spaced overlapping spatiotemporal
subsequences, each of which is decomposed into blocks and then cells. We use
the Histogram of Oriented Gradient (HOG3D) feature to encode the information in
each cell. We justify the use of LLC for encoding the block descriptor by
demonstrating its superiority over Sparse Coding (SC). Our sequence descriptor
is obtained via a logistic regression classifier with L2 regularization. We
evaluate and compare our algorithm with ten state-of-the-art algorithms on five
benchmark datasets. Experimental results show that, on average, our algorithm
gives better accuracy than these ten algorithms. Comment: ICPR 201
Theoretical Design and Analysis of Multivolume Digital Assays with Wide Dynamic Range Validated Experimentally with Microfluidic Digital PCR
This paper presents a protocol using theoretical methods and free software to design and analyze multivolume digital PCR (MV digital PCR) devices; the theory and software are also applicable to design and analysis of dilution series in digital PCR. MV digital PCR minimizes the total number of wells required for “digital” (single molecule) measurements while maintaining high dynamic range and high resolution. In some examples, multivolume designs with fewer than 200 total wells are predicted to provide dynamic range with 5-fold resolution similar to that of single-volume designs requiring 12 000 wells. Mathematical techniques were utilized and expanded to maximize the information obtained from each experiment and to quantify performance of devices and were experimentally validated using the SlipChip platform. MV digital PCR was demonstrated to perform reliably, and results from wells of different volumes agreed with one another. No artifacts due to different surface-to-volume ratios were observed, and single molecule amplification in volumes ranging from 1 to 125 nL was self-consistent. The device presented here was designed to meet the testing requirements for measuring clinically relevant levels of HIV viral load at the point-of-care (in plasma, 1 000 000 molecules/mL), and the predicted resolution and dynamic range was experimentally validated using a control sequence of DNA. This approach simplifies digital PCR experiments, saves space, and thus enables multiplexing using separate areas for each sample on one chip, and facilitates the development of new high-performance diagnostic tools for resource-limited applications. The theory and software presented here are general and are applicable to designing and analyzing other digital analytical platforms including digital immunoassays and digital bacterial analysis. It is not limited to SlipChip and could also be useful for the design of systems on platforms including valve-based and droplet-based platforms. 
In a separate publication by Shen et al. (J. Am. Chem. Soc., 2011, DOI: 10.1021/ja2060116), this approach is used to design and test digital RT-PCR devices for quantifying RNA.
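As a rough illustration of the statistics behind such digital assays, the sketch below estimates a concentration from counts of positive wells, assuming molecules are Poisson-distributed among wells. It is not the authors' software: the function names, the search bracket, and the 160-wells-per-volume design are hypothetical choices made only for this example.

```python
import math


def single_volume_estimate(positive, total, volume_nl):
    """Closed-form Poisson estimate (molecules/nL) for wells of one volume."""
    if not 0 < positive < total:
        raise ValueError("need at least one positive and one negative well")
    # P(well is negative) = exp(-c * v), so c = -ln(negative fraction) / v.
    return -math.log(1.0 - positive / total) / volume_nl


def multivolume_mle(wells, lo=1e-9, hi=1e3, iterations=200):
    """Maximum-likelihood concentration (molecules/nL) across several volumes.

    wells: iterable of (volume_nl, total_wells, positive_wells) tuples.
    lo, hi: assumed search bracket for the concentration.
    """
    wells = list(wells)

    def score(c):
        # Derivative of the log-likelihood with respect to c.
        total = 0.0
        for v, n, k in wells:
            p_negative = math.exp(-c * v)
            p_positive = -math.expm1(-c * v)  # 1 - exp(-c * v), computed stably
            total += k * v * p_negative / p_positive - (n - k) * v
        return total

    if score(lo) <= 0 or score(hi) >= 0:
        raise ValueError("estimate lies outside the search bracket")
    # The score decreases monotonically in c, so bisection finds its single root.
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)  # bisect on a log scale
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


# Hypothetical design: 160 wells at each of four volumes, with positive counts
# set to their expected values at 0.02 molecules/nL.
true_concentration = 0.02
wells = [(v, 160, 160 * -math.expm1(-true_concentration * v))
         for v in (1.0, 5.0, 25.0, 125.0)]
estimate = multivolume_mle(wells)
```

With a single well volume, the maximum-likelihood estimate reduces to the closed-form expression in `single_volume_estimate`. Combining several volumes lets the small wells stay informative at high concentrations, where the large wells are almost all positive.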
#REVAL: a semantic evaluation framework for hashtag recommendation
Automatic evaluation of hashtag recommendation models is a fundamental task
in many online social network systems. In the traditional evaluation method,
the recommended hashtags from an algorithm are firstly compared with the ground
truth hashtags for exact correspondences. The number of exact matches is then
used to calculate the hit rate, hit ratio, precision, recall, or F1-score. This
way of evaluating hashtag similarities is inadequate as it ignores the semantic
correlation between the recommended and ground truth hashtags. To tackle this
problem, we propose a novel semantic evaluation framework for hashtag
recommendation, called #REval. This framework includes an internal module
referred to as BERTag, which automatically learns the hashtag embeddings. We
investigate how the #REval framework performs under different word embedding
methods and different numbers of synonyms and hashtags in the recommendation
using our proposed #REval-hit-ratio measure. Our experiments with the proposed
framework on three large datasets show that #REval gives more meaningful hashtag
synonyms for hashtag recommendation evaluation. Our analysis also highlights
the sensitivity of the framework to the word embedding technique, with #REval
based on BERTag superior to #REval based on FastText and Word2Vec. Comment: 18 pages, 4 figures
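The traditional exact-match evaluation that this abstract criticizes can be sketched as follows. This is a minimal illustration, not the #REval framework or BERTag: the function name and the binary "hit" indicator are assumptions, since definitions of hit rate and hit ratio vary between papers.

```python
def exact_match_scores(recommended, ground_truth):
    """Score recommended hashtags against ground truth by exact string match."""
    rec = {tag.lower() for tag in recommended}
    truth = {tag.lower() for tag in ground_truth}
    matches = len(rec & truth)
    precision = matches / len(rec) if rec else 0.0
    recall = matches / len(truth) if truth else 0.0
    f1 = 2 * precision * recall / (precision + recall) if matches else 0.0
    # "hit" is 1 when at least one recommended hashtag matches exactly.
    return {"hit": 1 if matches else 0, "precision": precision,
            "recall": recall, "f1": f1}


# One exact match plus one synonym pair: only "#pizza" is credited.
mixed = exact_match_scores(["#nyc", "#pizza"], ["#newyork", "#pizza"])

# A synonym with no exact match scores zero on every measure.
synonyms_only = exact_match_scores(["#nyc"], ["#newyork"])
```

The second call shows the limitation described above: a semantically close recommendation receives no credit under exact matching.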
A Comparative Review of Recent Kinect-based Action Recognition Algorithms
Video-based human action recognition is currently one of the most active
research areas in computer vision. Various research studies indicate that the
performance of action recognition is highly dependent on the type of features
being extracted and how the actions are represented. Since the release of the
Kinect camera, a large number of Kinect-based human action recognition
techniques have been proposed in the literature. However, there still does not
exist a thorough comparison of these Kinect-based techniques under the grouping
of feature types, such as handcrafted versus deep learning features and
depth-based versus skeleton-based features. In this paper, we analyze and
compare ten recent Kinect-based algorithms for both cross-subject action
recognition and cross-view action recognition using six benchmark datasets. In
addition, we have implemented and improved some of these techniques and
included their variants in the comparison. Our experiments show that the
majority of methods perform better on cross-subject action recognition than
cross-view action recognition, that skeleton-based features are more robust for
cross-view recognition than depth-based features, and that deep learning
features are suitable for large datasets. Comment: Accepted by the IEEE Transactions on Image Processing
The pattern of foreign property investment in Vietnam: the apartment market in Ho Chi Minh City
As globalization proceeds, transnational property development is increasing. Especially in emerging markets, foreign developers’ influence in changing the local landscape is becoming significant. In this research, the behavioral patterns of foreign developers in the apartment market of Ho Chi Minh City, Vietnam were identified. To understand the dynamics of foreign developers, the types of products that were being created, where the investments were located, and the differences in development strategies adopted by foreign developers in comparison to domestic counterparts were identified. To accomplish this, data on apartment projects and statistics were collected, and a series of spatial analyses including sieve mapping, histogram analysis, factor analysis and logistic regression was conducted. In addition, closer examination was made of specific cases to understand the dynamics among foreign and domestic developers, also allowing the identification of some regularities in the patterns of foreign developments. Besides presenting detailed results, this paper also seeks to account for the conditions that appear to have generated these patterns and characteristics.
Hallucinating IDT Descriptors and I3D Optical Flow Features for Action Recognition with CNNs
In this paper, we revive the use of old-fashioned handcrafted video
representations for action recognition and put new life into these techniques
via a CNN-based hallucination step. Despite the use of RGB and optical flow
frames, the I3D model (amongst others) thrives on combining its output with the
Improved Dense Trajectory (IDT) and its low-level video descriptors,
encoded via Bag-of-Words (BoW) and Fisher Vectors (FV). Such a
fusion of CNNs and handcrafted representations is time-consuming due to
pre-processing, descriptor extraction, encoding and tuning parameters. Thus, we
propose an end-to-end trainable network with streams which learn the IDT-based
BoW/FV representations at the training stage and are simple to integrate with
the I3D model. Specifically, each stream takes I3D feature maps ahead of the
last 1D conv. layer and learns to 'translate' these maps to BoW/FV
representations. Thus, our model can hallucinate and use such synthesized
BoW/FV representations at the testing stage. We show that even features of the
entire I3D optical flow stream can be hallucinated, thus simplifying the
pipeline. Our model saves 20-55h of computations and yields state-of-the-art
results on four publicly available datasets. Comment: First two authors contributed equally. This paper is accepted by ICCV'1